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Abstract 

Background: Network inference deals with the reconstruction of nnolecular networks from experimental data. Given 
N molecular species, the challenge is to find the underlying network. Due to data limitations, this typically is an ill- 
posed problem, and requires the integration of prior biological knowledge or strong regularization. We here focus on 
the situation when time-resolved measurements of a system's response after systematic perturbations are available. 

Results: We present a novel method to infer signaling networks from time-course perturbation data. We utilize 
dynamic Bayesian networks with probabilistic Boolean threshold functions to describe protein activation. The model 
posterior distribution is analyzed using evolutionary MCMC sampling and subsequent clustering, resulting in 
probability distributions over alternative networks. We evaluate our method on simulated data, and study its 
performance with respect to data set size and levels of noise. We then use our method to study EGF-mediated 
signaling in the ERBB pathway. 

Conclusions: Dynamic Probabilistic Threshold Networks is a new method to infer signaling networks from 
time-series perturbation data. It exploits the dynamic response of a system after external perturbation for network 
reconstruction. On simulated data, we show that the approach outperforms current state of the art methods. On the 
ERBB data, our approach recovers a significant fraction of the known interactions, and predicts novel mechanisms in 
the ERBB pathway. 



Background 

The availability of high throughput experimental plat- 
forms has transformed molecular biology into a data-rich 
science. However, the development of computational and 
mathematical tools to extract and interpret the wealth 
of information hidden in these data is lagging behind. 
For example, genome wide RNA interference screens 
have enabled the phenotypic characterization of genes 
at an unprecedented scale in living cells [1]. However, 
the interpretation of such data and the placement of 
hits in their functional and temporal context in cellu- 
lar pathways remains a major challenge [2,3]. Machine 
learning can address many of the questions arising from 
the interpretation of large scale molecular biological data, 
and has become an important tool of bioinformatics 
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and systems biology. Automatic network inference deals 
with the reconstruction of regulatory or signaling net- 
works directly from experimental data using statistical or 
machine learning approaches, a field that has received 
significant attention in the last decade. There is a wide 
variety of different approaches available that can be used 
to infer genetic regulatory or signal transduction networks 
from experimental data. Approaches employed include 
Bayesian networks [4-13] and dynamic Bayesian networks 
[14-18], Boolean models [19-21], auto-regressive models, 
correlation-based and mutual-information based models, 
clustering techniques, differential equation models, and 
others [22-27]. These methods differ in the level of detail 
at which they reconstruct networks, and in their under- 
lying assumptions and data requirements. Some produce 
an undirected graph, where edges do not indicate which 
gene or protein in a connected pair is the activator, and 
which is the activated gene or protein [23] . Others specify 
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the regulator with directed edges [14], and a few label the 
edges with kinetic parameters [28]. 

There are a number of successful applications of net- 
work inference approaches to elucidate cellular signaling 
pathways, including meta-approaches integrating differ- 
ent methods [29]. Among the successful applications, for 
example, Sachs et al used Bayesian networks to recon- 
struct cellular protein signaling networks from protein 
phosphorylation measurements [5]. Nelander et al used 
a model based on nonlinear differential equations to 
infer signaling networks in cancer from combinatorial 
drug perturbation data [30]. Eduati et al. demonstrate 
the integration of literature-knowledge into data driven 
approaches, and show a successful application to a sig- 
naling network related to growth- signaling and inflam- 
mation [31]. Ciaccio et al, used Bayesian networks and 
two different mutual-information based approaches to 
infer signaling networks downstream of the EGF recep- 
tor [32]. Hill et al. use dynamic Bayesian networks to 
study signaling in a cancer cell line [18]. Other approaches 
include nested effects models (NEM) [33], deterministic 
effects propagation networks (DEPN) [34] or probabilistic 
Boolean threshold networks (PBTN) [35], and have been 
applied, for example, to reconstruct signaling networks in 
the ERBB pathway, or in the innate immune response to 
infection. 

Network reconstruction can be performed from obser- 
vational data alone. However, the quality of the recon- 
struction increases substantially if experimental pertur- 
bations followed by observations of the systems dynamic 
response are available [36]. Surprisingly, while there are 
many approaches available for time course data, and sev- 
eral approaches for perturbation data, there are only few 
methods available that can handle both time course data 
and perturbations at the same time. Exceptions are, for 
example, the differential equation approach presented by 
Nelander et al. [30], Dynamic Nested Effects Models 
(DNEM) [37,38] or Dynamic Deterministic Effects Prop- 
agation Models (D-DEPN) [39]. Among the stochastic 
approaches, DNEMs rely on high-dimensional, indirect 
readouts of rather qualitative knockdown "effects", such 
as microarrays performed at different time points after 
every knockdown, or multidimensional features derived 
from live cell imaging. Such data are often not avail- 
able, and also provide only indirect information about 
the underlying signaling pathway. Frohlich et al. proposed 
Deterministic Effects Propagation Networks (DEPN) to 
reconstruct networks from perturbation data with direct 
observation of involved proteins, measured e.g. using 
reverse phase protein arrays [34]. However, DEPNs treat 
each time point as an independent measurement and 
do not model the time dependent behavior of the sys- 
tem explicitly. Bender et al. proposed Dynamic DEPNs 
(D-DEPNs), to take the availability of longitudinal data as 



well as inhibitory interactions in a network into account 
[39]. D-DEPNs use the Viterbi algorithm to identify the 
state transitions of a hidden Markov model from the time 
course data, with states corresponding to combinations of 
activities of nodes in the network. A likelihood function 
is defined to score network models given the estimated 
state transitions, and the network space is then searched 
using a genetic algorithm. Due to this underlying pro- 
cedure, D-DEPNs require relatively long time series (10 
time points in the original publication of the method), and 
have substantial running time requirements if large num- 
bers of different perturbations involving most or all of the 
proteins in a network are performed. 

We here focus on the problem of reconstructing sig- 
naling networks from short time series with only two 
or three time points, after a large number of different, 
possibly combinatorial perturbations, targeting most or 
all of the genes in the network. Such data arise, for 
example, if RNAi experiments are combined with protein 
array measurements at few time points after the per- 
turbation, or in time-resolved mass spectrometric assays 
under different conditions. We propose a new method 
for the identification of kinetic models of signaling net- 
works from such data, dynamic probabilistic Boolean 
threshold networks (D-PBTN), which can treat multi- 
ple, combinatorial interventions as well as incomplete 
observations. Through the use of a fairly simple, discrete- 
state model where proteins are either active or inactive, 
our method is applicable to qualitative data and does 
not require detailed quantitative measurements. Further- 
more, through the Bayesian framework employed, we can 
easily handle noisy or missing data, and using MCMC 
sampling, can analyze full posterior distributions over 
model parameters, yielding information about different, 
alternative network topologies that are consistent with 
the experimental data. Our work builds upon probabiUs- 
tic Boolean threshold networks (PBTN) regarding the 
underlying dynamic network model [35], but extends the 
method in the following aspects: (1) We fully exploit the 
information from time course data, by explicitely integrat- 
ing time into the model likelihood. In contrast, PBTNs 
in their original implementation only exploit observa- 
tional data taken at a single time point, usually at steady 
state. (2) PBTNs are limited to relatively small networks 
of around 7 nodes. Through a more efficient sampling 
method, the approach presented here can be used for 
larger networks, effectively more than doubling the fea- 
sible network size. (3) PBTNs were implemented with a 
focus on data with only single downstream observations at 
a single time point, and therefore similarly to other (non- 
dynamic) Bayesian network based approaches, do not 
allow feedback loops in the network. The approach pre- 
sented here can handle feedback loops in the network. (4) 
D-PBTNs can handle overexpression and knockdown 
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data, in contrast to PBTNs, which are limited to knock- 
down data. 

Our approach uses discrete states of the proteins in the 
network, which can be either "on" or "off". In terms of the 
biological interpretation, this could be phosphorylation 
networks, in which case edges would correspond to phos- 
phorylation or dephosphorylation events. Alternatively, 
assuming that proteins are "present" or "absent", edges in 
the model could also be interpreted as transcriptional reg- 
ulation. Correspondingly, nodes can represent "activated" 
protein levels or total protein abundance depending on 
context. 

We evaluate the performance of our method on simu- 
lated data, and study its behavior on networks of different 
sizes, with different amounts of available data and dif- 
ferent levels of noise, and compare results against PBTN 
[35], DEPN [34] and BDAGL, a dynamic Bayesian net- 
work based approach [40,41]. Dynamic DEPNs could not 
be used for the comparison, as they were unable to handle 
the short time courses and large number of perturba- 
tions. We show inference results for ERBB signaling in 
breast cancer using DEPN, PBTN and D-PBTN, demon- 
strating superior performance of D-PBTN, and show- 
ing cross-talk between different branches of the ERBB 
pathway. 

Methods 

Mathematical model 

We model a signaling network by a weighted, directed 
graph Q = (V, f). Cycles are permitted. The node set 
V = corresponds to proteins; an edge G £ 

from node v/ to vj represents a regulation of vj by v/, such 
as activation or deactivation via (de)phosphorylation. The 
strength and type of an interaction is specified by wij G M, 
with Wi^j > 0 for activations, ivy < 0 for inactivations 
and Wij := 0 if eij ^ S, Given N, the number of nodes 
in the network, we can thus describe the network topol- 
ogy by its N X N weighted adjacency matrix W = (wij). 
We furthermore consider each node a Boolean vari- 
able. Model time is assumed discrete, and all nodes are 
updated simultaneously between two time steps. The state 
of node Vi(t + 1) at model time ^ + 1 is described by a 
stochastic function of the states of all nodes V at model 
time t: 

p(vi(t+l) = l\V(t)) 



1 + exp (^-y (wo^i + YlJ=i ^U^jit))) 

The relationship between a target protein and its reg- 
ulators is hence modeled by a sigmoid function, an 
ansatz for nonlinear systems with saturation phenom- 
ena. Equation (1) activates or deactivates a target protein 



depending on the weighted sum of incoming regulations. 
The level of stochasticity in this process is defined by the 
positive parameter y, whereas wqj defines a basal acti- 
vation probability of Vi in the absence of any regulatory 
effects. 

In case of interventions, the state of a node v targeted by 
perturbation is no longer subject to the stochastic dynam- 
ics of the system, instead, its value is fixed through the 
intervention. We write /C^, k e {1, ...,/<r}, for the set 
of nodes targeted simultaneously in perturbation experi- 
ment /c, and assume that a knockdown of JCk amounts to 
fixing all nodes v/ g JCj^ to the "off" state at all model 
time points after the knockdown. Similarly, overexpres- 
sion experiments can be simulated by fixing the affected 
nodes Vi to "on". 

Let us now assume that we have observed experimen- 
tally the states of the nodes vi at T different (real) time 
points T}f i e {0, . . . , T}. We note in passing that this 
normally requires a discretization step, as experimental 
data are typically measured continuously - see the section 
on the ERBB data set below. Let us for now assume that 
the set of experimental observations V consists of mea- 
surements of N discrete protein states at T time points 
after K different perturbations. We write d^(i) e {0, 1} 
for the observed activity of node v/ at the real time point 
7J after knockdown of node set /C/^. To map real time 
7J to model time we introduce a parameter vector 
r e to denote the number of model time steps that 
correspond to the real time elapsed between the differ- 
ent experimental time points 7^, that is, denotes the 
number of model time steps according to equation (1) 
required to transition from real time point 7^_^ to time 
point 7^. 

Assuming a one-to-one correspondence between model 
time and real time, that is, assuming that = I for all 
i, we can define a likelihood by calculating the product of 
p(vi(t + 1) = d^(t + l)\d'!^(t)) according to equation (1) 
over all nodes v, time points t and perturbations k: 

piv\n) =1111 n 

k=l f=0 i=l...Ni^lCk (2) 
p (vi(t + 1) = 4(t + l)\v(t) = d\t)) . 

If Xt > 1, one or more intermediate model time steps 
must be made to transition from experimental time 
point T2_i to experimental time point 7^. As no exper- 
imental observations are available for the intermediate 
time steps, we must marginalize over unobserved time 
steps in the evaluation of the likelihood (2). Writing 
v(t - 1), vHO, v^(?), . . . , v^i{i), v{i) for the full sequence 
of model steps, where the sequence of intermediate steps 
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V = v^(i),v^(t), . . . ,v'^t(i) are unobserved, equation (2) 
becomes 



K T 

p(v\Q) = Y\Y\ 

k=l t=l 



p{v'Ct)yCt))'"PiyCt))\v'Kt)) 



(3) 



where /?(v^|v^) = Vu=iP(y^i\^^) comprises the product 
over all individual nodes as in equation (1), and where 
the sum is over all possible combinations of values for the 
intermediate model steps. Missing values in the experi- 
mental data can be treated similarly - the likelihood can 
then be computed by integrating equation (3) over all 
possible combinations of values for unobserved <if 

We note that PBTNs are based on a closely related likeli- 
hood function, and employ the same underlying model for 
the relation between nodes, given in equation (1). How- 
ever, PBTNs differ in a key aspect from the framework 
presented here: PBTNs do not allow multiple time points 
in their likelihood, but rather assume that measurements 
are made at a single time point after knockdown, usually 
taken at steady state. 

Inclusion of prior knowledge 

In many biological settings, available data are insufficient 
to unambiguously reconstruct the underlying network. In 
such situations, strict regularization of the objective func- 
tion using for example maximum parsimony [28], or the 
inclusion of additional prior biological knowledge [42] 
can help. Both can be done via Bayes' theorem using 
prior distributions on the model parameters The model 
posterior is 



(4) 



where p(V\Q) is the likelihood function as defined above, 
pm)\s an adequately defined prior distribution, and p(T>) 
is a constant normalizing factor that does not depend on 
^2 and can be neglected in maximization or sampling. 
The prior can be written as piP) — p{wQ^.)p(W)p{x), 
assuming that the different types of model parameters 
are independent. Then, p(yV) describes our belief about 
the correct topology, prior to seeing any data, and p{x) 
describes our belief about the speed of signal transduction 
in the network. Assuming independence of parameters, 
the full prior can be written as: 



pio) = Y\p Y\p (^v) ^^^^ • 



(5) 



In the network inference setting, it is unlikely that the true 
underlying biological network is densely connected. We 



rather expect a sparse network, where most of the nodes 
have only few other nodes acting on them [43-47]. This 
can be mathematically encoded in a prior on Wi^j 



p(wi,j\q,s) = exp - Mv'l^^ ' 



(6) 



with positive shape and rate parameters q and s [48,49]. 
Note that, after taking the negative logarithm and drop- 
ping the normalizing factor l/(qs^), this is equivalent to 
regularization using the L^-norm, and corresponds to a 
Laplacian prior if ^ = 1 or a normal prior if ^ = 2, 
and the parameter s determines the width of the distribu- 
tion. The parameters /x^^y can be used to encode additional 
knowledge on specific edges, by setting individual fiij to 
nonzero values. If no prior knowledge is available, fiij 
defaults to 0. We note that the can be chosen continu- 
ously depending on the expected strength of the effect and 
certainty in its presence. We expect signaling molecules 
to be "off" in the absence of any signal, and therefore use 
independent negative gamma priors for p(wo,i)y 



p(m,i) 



(7) 



with rate and shape parameters ro and 5o, respectively. We 
hence assume wo,i to be non-positive. This assumption 
can easily be replaced if signaling molecules should also 
be allowed to be "on" in the absence of signal, for example 
by using a gamma prior on the absolute value \wo,i\, or a 
normal prior on wq,/. Lastly, as model time is discrete and 
necessarily positive, we use a binomial prior p(ri) on r/. 



with parameters n and p. 



(8) 



Network inference 

We could now maximize p{Q.\V) to find the most prob- 
able network underlying the data V, This is rea- 
sonable if the posterior is unimodal and has a sharp, 
narrow optimum. However, since the optimization prob- 
lem encountered for real data is typically underdeter- 
mined, several alternative models often explain the 
data equally well, and marginal distributions for param- 
eters show wide peaks and large confidence intervals. 
We therefore use Markov chain Monte Carlo (MCMC) 
sampling to fully characterize the posterior. The main dif- 
ficulty in doing this lies in the calculation of the likelihood 
p{V\Q), which due to the required marginalization over 
unobserved nodes and time points quickly becomes very 
involved and time-consuming, compare equation (3). We 
therefore approximate p(V\Q) by simulating the dynam- 
ics of the model with parameters for each knockdown 
/C^, and then compare the simulation results M]^{t) at each 
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time point t with the experimental data d^(t). If this is 
repeated a large number of times, we can approximate 
p(I)\Q) by the number of times we get V back in the sim- 
ulation runs. By combining this with MCMC sampling, 
a very efficient approach for evaluation of the poste- 
rior distribution arises, that does not require an explicit 
computation of the likelihood [50] . In contrast to PBTN, 
which employed a Metropolis -Hastings based sampler, 
we have now implemented a far more efficient sampler 
in D-PBTN. We employ distributed evolutionary Monte 
Carlo (DEMC) [51], a sampler that combines features of 
a distributed genetic algorithm with MCMC sampling. In 
brief, DEMC starts with a population of m Markov chains, 
which are grouped into g subpopulations. Each individ- 
ual chain describes a specific network Q. Updates within 
each subpopulation are done using the genetic operators 
mutation and cross-over, and migration allows individual 
solutions to move between subpopulations. Thereafter, 
each chain is scored by initializing each node in the net- 
work with the values experimentally observed at time 0, 
and then simulating using equation (1). This is followed 
by a Metropolis-Hastings step to accept or reject the 
new state, to ensure ergodicity of the chains. A detailed 
description of the procedure is given in Additional file 1. 

Data simulation 

Since for real experimental data typically no "gold stan- 
dard" network is available to assess results, we used 
simulated data to evaluate D-PBTN. This allows it to 
systematically alter properties of the data, such as the 
inherent level of noise or amount of missing values, and 
to directly assess the performance impact this has. We 
simulated data in three different ways: 

Simulated Network 1 (SNl) is a 7-node feedforward net- 
work, shown in Figure lA (left). Weights Wij were set to 
1 for edges shown in the figure, and 0 otherwise. Baseline 
weights were set to wqj = —0.25 to have a high prob- 
ability for an unregulated protein to be in the off state. 
Data simulation for this network was performed assum- 
ing that all proteins are in the off state initially, except for 
the receptor 1. Stochastic signaling is then simulated using 
equation (1), updating all nodes simultaneously, and using 
y = 10. Knockdowns were simulated by fixing targeted 
proteins in the off state. We simulated knockdowns of all 
individual proteins, and two combinatorial knockdowns 
of proteins (3, 4) and (4, 6). Two time points were used for 
network inference, the first one immediately after knock- 
down and activation of the receptor; the second time point 
after 6 simulation steps, allowing for sufficient time for the 
signal to propagate through the whole network. 

Using the same network topology and parameters, 
we furthermore simulated overexpression experiments 
{Simulated Network la, SNla), Overexpressed nodes were 
fixed to the on state, while we assumed all other nodes to 



be inactive initially. We simulated overexpression exper- 
iments of all individual nodes, as well as combinatorial 
overexpression of nodes (3,4) and nodes (4,6), again by 
running the simulation over 6 time steps as above. 

Simulated Network 2 (SN2) extends SNl by a feedback 
loop, see Figure lA (right). Data for SN2 was simulated 
using the model proposed by Frohlich et al. [34]. In con- 
trast to SNl, signaling is deterministic, and noise arises 
only at the measurement stage. We initialized all nodes as 
above and assumed that a node deterministically becomes 
active if there are more activating than inhibiting pro- 
teins among its parents. Measurements were simulated by 
sampling from a normal distribution with A/'(0.95, 0.01)- 
distributed mean for active nodes, and A/^(0.6, 0.01)- 
distributed mean if the node was inactive. The variances 
of the normal distributions were drawn from an inverse 
scaled chi-squared distribution /wv^2 (4.4, 0.023), as sug- 
gested by Frohlich et al. [34] based on their experimental 
observations. Measurements were simulated immediately 
after the knockdown and activation, and after four and six 
simulation steps. 

Simulated Network 3 (SN3) comprises a set of ten 
different networks randomly extracted from the KEGG 
database, each containing seven connected proteins, tak- 
ing only protein-protein interactions into account. We 
simulated time-course data for 10 time points as described 
for SN2 for each of the ten subnetworks. For each sub- 
network, in addition to the starting point, two further 
randomly selected time points were included into the final 
data set used for network reconstruction. Reported per- 
formance results are average performances over all ten 
subnetworks, avoiding biasing of results towards a single 
topology or time interval. 

Implementation and performance evaluation 

We implemented D-PBTN in CfJ, and evaluated the app- 
roach on both simulated and real experimental data. 
Source code (tested on the Windows platform) and addi- 
tional information is available from our website at http:// 
www.kaderali.org/software.html. 

Convergence of the Markov chains was assessed using 
the methods proposed by Brooks and Gelman [52], results 
are given in Additional file 1. Sampled networks were 
aggregated using halite clustering [53] . Halite is a density- 
based clustering method, that we use to identify clusters 
of similar networks from the Markov chains. Cluster rep- 
resentative networks were computed using the within- 
cluster median of Wij. 

To quantify network reconstruction performance, we 
performed receiver operator characteristic (ROC) and 
precision recall (PR) analysis. In brief, a threshold 8 is used 
for discretization of edge values, where the median Wi^ 
of sampled values for a particular edge Wij is compared 
with 5. If \Wij\ > 8y the edge eij is assumed present in 
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Figure 1 Evaluation on simulated data. A: The panel shows network topologies used to simulate data. Simulated Network 1 (SNl) is a simple 
feedforward network, whereas Simulated Network 2 (SN2) contains a negative feedback loop. B: Network reconstruction was performed for SNl, 
changing the number of unobserved proteins. Shown is the distribution of the area under the ROC curve (AUCroc) of 100 replicates of simulated 
data sets, over the fraction of unobserved proteins. From left to right: D-PBTN, DEPN, BDAGL, PBTN. The dashed line at /\[7Qoc = 0.5 shows 
expected results for random guessing. C: The panel shows the distribution of AUCroc values obtained for different levels of noise on SNl . Noise is 
introduced by switching the state of the indicated fraction of proteins in the "measured" data, thus introducing errors in the data. D: This panel 
shows performance of D-PBTN, DEPN, BDAGL AND PBTN on networks sampled from KEGG, with data simulated as described in methods. Shown are 
AUCroc values ofl 00 simulated data sets, generated from ten different KEGG subnetworks. 



the network, eij e 6, and absent otherwise, eij ^ S, 
The resulting network with edge set £ is then compared 
against the gold standard network, and sensitivity, speci- 
ficity and precision are computed for given 8, This is 
then repeated by varying 5, and sensitivity is plotted over 
specificity for different 5 in a ROC plot, and precision 
over sensitivity in a PR plot. Finally, both ROC and PR 
curves can be summarized by computing the area under 
the curve {AUCroc and AUCpr, respectively) as single 
numbers summarizing performance over a wide range of 



threshold values 5. The advantage of this approach is that 
it is not necessary to pick a specific threshold 5 for the dis- 
cretization, which may be difficult to do as this depends 
on (unknown) user preferences regarding the tradeoff 
between false positive and false negative edges, but the 
analysis summarizes performance over the full range of 
feasible thresholds 5. 

We first assessed performance of D-PBTN on sim- 
ulated data, varying the amount of noise in the data, 
using different amounts of simulated data and different 
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models for data simulation, and evaluated stability of 
results for changing model hyperparameters. We then 
assessed performance on publicly available data regard- 
ing signaling in the ERBB network. Inferred networks on 
real data were assessed using STRING 9.0 as reference 
[54], using only edges with at least 70% confidence in 
STRING. Since interactions given in STRING are undi- 
rected and unsigned, we disregarded directional infor- 
mation and edge sign. We compare performance of our 
approach with DEPN [34], the non-time-series version of 
our approach (PBTN) [35], the Bayesian Directed Acyclic 
Graph Learning tool (BDAGL) developed by Eaton and 
Murphy [40,41], and with random guessing. Notably, D- 
DEPN could not be used on the simulated data due to 
the nature of the short time courses used. Results for 
PBTN were obtained using a modified version of the 
original PBTN implementation [35] that optimizes the 
posterior distribution instead of sampling, thus allowing 
for a more efficient computation on networks with a uni- 
modal posterior. This compromise is necessary, as the 
original sampling-version of PBTN is too slow to allow 
a rigorous evaluation on networks of the size used here. 
Time points were used as independent measurements in 
PBTN. To obtain results for the DEPNs, we used greedy 
hill-climbing and bootstrapping with 100 bootstrap sam- 
ples, as proposed by the authors of the method. The 
bootstrap samples were used to obtain weights on the 
edges, which were subjected to ROC and PR analysis. We 
note that DEPNs operate on equivalence classes of graphs, 
and can not distinguish between graphs within an equiv- 
alence class. We therefore used the transitively closed 
network as a representative of the full equivalence class 
to evaluate the DEPN results. This is in contrast to the 
other methods, which in principle are able to determine 
a unique network, provided sufficient data is given, and 
which we compared directly to the gold standard network. 
Lastly, results for BDAGL were obtained using the modi- 
fied marginal likelihood method for perfect interventional 
data, with uniform prior. 

Results 

As a first test of our method, we assessed the performance 
of D-PBTN on simulated data, without noise and with 
full observations of all nodes, using the network topol- 
ogy SNl. To compare performance of D-PBTN with other 
approaches, we used the non-time series version PBTN, 
DEPN and BDAGL. For D-PBTN, shape and rate param- 
eters of the prior distribution (6) on w were set to q,s = 
1, respectively, corresponding to a standard LI regular- 
ization of the edge weights as the simplest conceivable 
setting. Rate and shape parameters of the negative gamma 
prior on wo,i were set to ro = 8, 5o = 16, resulting in 
a moderate deactivation of unregulated proteins, and a 
binomial prior with n = 20yp = 0.3 was used for r, giving 



an expected value of r of 6 time steps. The stochasticity 
parameter y was set to 1.8, corresponding to an average 
level of noise in the data. Three sub-populations with four 
Markov chains were used with 10^ steps each and a burn- 
in of 5,000 steps. Parameters for PBTN were set accord- 
ingly, using the two time points as independent replicate 
measurements. Results for all four approaches are shown 
in Table 1, values shown are median values of the AUCroc 
2ind AUCpR measures over 100 replicate simulations. With 
no noise and no missing values, D-PBTN shows superior 
performance over the other three approaches on the simu- 
lated data. Notably, the comparison between D-PBTN and 
PBTN shows the added value of the temporal information, 
indicating that while a significant part of the information 
in the data comes from the knockdowns, even short time 
courses with just 2 time points are of added value. BDAGL 
shows weaker results than the other three approaches, 
both in terms of the AUCroc and AUCpr, possibly 
due to a significantly different underlying mathematical 
model. 

Effect of missing data 

We next assessed the effect of missing observations on 
network inference, using the SNl data. Results are shown 
in Figure IB iov AUCroc and Additional file 2: Figure SI 
for AUCpR, and are summarized in Table 1. Note that 
BDAGL could not be used in this comparison, as the 
method cannot handle missing values. We randomly 
selected 16%, 33% and 50% of the nodes, and completely 
removed the corresponding simulated observations for 
these nodes before carrying out the network inference. 
Model parameters were set as described above. Computa- 
tion took approximately 7.4 hours for D-PBTN on a 3 GHz 
Intel 64-bit processor (single-threaded). For comparison, 
runtime for DEPN was only around 30 minutes. Regarding 
quality of the inferred networks, D-PBTN shows superior 
performance over DEPN, both with respect to AUCroc 
and AUCpR, PBTN did not perform better than random 
guessing on all runs with missing values (AUCroc ^ 0.5). 
Both D-PBTN and DEPN show a decrease in the qual- 
ity of the inferred network with increasing amounts of 
missing observations. In spite of this, even with 50% of 
missing data, both approaches still perform significantly 
better than guessing (both p < 0.0001, Welch's t-test 
on AUCroc values). Median AUCroc decreased from 
0.85 to 0.58 (D-PBTN), 0.77 to 0.53 (DEPN) and 0.75 to 
0.52 (PBTN) when comparing the 0% to the 50% miss- 
ing value performance. For each considered amount of 
missing data, D-PBTN outperformed DEPN (0% miss- 
ing: AUCroc 0.85 vs. 0.77, p < 0.0001; 16% missing: 
AUCroc 0.72 vs. 0,68, p < 0.0001; 33% missing: AUCroc 
0.63 vs. 0.58, p < 0.0001; 50% missing: AUCroc 0.58 vs. 
0.53,/? = 0.0029); results are comparable for AUCrr, see 
Table 1. 
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Table 1 Performance comparison of D-PBTN, PBTN, DEPN and BDAGL on simulated data 



AUCroc 



Network 








SN1 








SN2 


SN3 


Noise 


0% 


16% 


33% 


50% 


0% 


0% 


0% 


0% 


0% 


Missing 


0% 


0% 


0% 


0% 


16% 


33% 


50% 


0% 


0% 


D-PBTN 


0.85 


0.68 


0.64 


0.56 


0.72 


0.63 


0.58 


0.78 


0.66 


PBTN 


0.75 


0.51 


0.51 


0.51 


0.46 


0.51 


0.52 


0.75 


0.65 


DEPN 


0.77 


0.63 


0.58 


0.54 


0.68 


0.58 


0.53 


0.60 


0.53 


BDAGL 


0.66 


0.66 


0.65 


0.65 


- 


- 


- 


0.75 


0.47 












AUCpR 










Network 








SN1 








SN2 


SN3 


Noise 


0% 


16% 


33% 


50% 


0% 


0% 


0% 


0% 


0% 


Missing 


0% 


0% 


0% 


0% 


16% 


33% 


50% 


0% 


0% 


D-PBTN 


0.80 


0.48 


0.43 


0.33 


0.52 


0.41 


0.36 


0.58 


0.45 


PBTN 


0.79 


0.58 


0.58 


0.58 


0.18 


0.20 


0.23 


0.75 


0.58 


DEPN 


0.69 


0.41 


0.33 


0.29 


0.39 


0.35 


0.31 


0.41 


0.18 


BDAGL 


0.19 


0.19 


0.19 


0.20 








0.53 


0.28 



Shown are achieved values of the area under the ROC curve {AUCroc) and the area under the PR curve {AUCpr). Values shown are median values over 1 00 iterations. 
Inference was performed on the SN1 data, with data with different levels of noise and missing values, on the SN2 data including a negative feedback loop, and on the 
SN3 (KEGG) data set. Note that BDAGL cannot handle missing values. The upper part of the table shows the area under the ROC curve, while the lower part of the table 
shows the area under the PR curve. 



Effect of noise 

We next tested the effect of noise on inference perfor- 
mance, see Figure IC for AUCroc, Additional file 3: 
Figure S2 for AUCpr, and the summary in Table 1. To 
simulate noise, we randomly changed individual measure- 
ments in SNl from 0 to 1 and vice versa. Network infer- 
ence was performed on the resulting data (with no missing 
values), with parameters as described above, and resulting 
in comparable running times. All four approaches showed 
decreasing performance for increasing levels of noise, 
with median AZ/Q^oc decreasing from 0.85 (0% noise) to 
0.56 (50% noise) for D-PBTN, from 0.77 to 0.54 for DEPN, 
from 0.75 to 0.51 for PBTN, and from 0.66 to 0.65 for 
BDAGL. D-PBTN was the best performing method using 
AUCroc performance measure, whereas PBTN was 
not better than guessing, with median AUCroc = 0-51 
already at 16% noise. In contrast to this, we observed 
good AUCpR performance of PBTN, with AUCrr = 0.58 
at all noise levels. Directly comparing D-PBTN with the 
non-time-series version PBTN, we observed thdit AUCroc 
results for PBTN were inferior to D-PBTN at all noise lev- 
els, whereas PBTN showed better AUCrr on data with 
higher levels of noise. This is a very intersting observation, 
which we discuss further in the Discussion section below. 
BDAGL showed very robust performance for increas- 
ing noise levels, consistently yielding low, but stable 
AUCroc ^ 0.65, but with very low AUCrr ^ 0.19. We 
observed a further performance breakdown for BDAGL 
when noise levels increased beyond 50% (data not shown). 



Finally, comparing D-PBTN with DEPN, even with 50% of 
the data wrong, both approaches are significantly better 
than random guessing (both p < 0.0001), and D-PBTN 
outperforms DEPN for all noise levels (0% noise: AUCroc 
0.85 vs. 0.77, p < 0.0001; 16% noise: AUCroc 0.68 vs. 0.63, 
p < 0.0001; 33% noise: AUCroc 0.64 vs. 0.58,;? = 0.0014, 
50% noise: AUCroc 0.56 vs. 0.54, p = 0.0052); results for 
AUCpR were qualitatively equivalent, see Additional file 3: 
Figure S2 and Table 1. 

Overexpression experiments 

As a further test of the method, we assessed the use of 
overexpression data with D-PBTN, using the SNl a data 
set. Model parameters were chosen as for the knockdown 
experiments above. 

Results were comparable to results obtained using 
the knockdown data, with AUCroc = 0.88 (Knockdown: 
0.85) and a AUCrr = 0.79 (Knockdown: 0.80) for D- 
PBTN. For comparison, we also used DEPN with the 
overexpression data. This resulted in 2in AUCroc = 0.63 
and AUCpR = 0.62, both inferior to the D-PBTN results. 
We note that neither BDAGL nor PBTN in its original 
implementation are able to handle overexpression data, 
and could therefore not be included in the comparison. 

Effect of networic topology and model assumptions 

Data SNl were simulated using the model underlying our 
approach. Results may thus favor D-PBTN and PBTN. 
Furthermore, artificial topologies may not be represen- 
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tative of real biological networks. We therefore sampled 
"real" subgraphs from the KEGG database, and simulated 
data using a different model (section "Data simulation"). 
The obtained data set SN3 was used to assess performance 
of D-PBTN, PBTN, BDAGL and DEPN. D-PBTN prior 
and MCMC and PBTN parameters were chosen as above, 
except for the parameter p of the binomial prior on r, 
which was set to p = 0.2, resulting in a slightly smaller 
expected value for r. Results for BDAGL and DEPN were 
obtained as described above. 

Figure ID and Additional file 4: Figure S3 show AUCroc 
and AUCpR results obtained, respectively, and all results 
are summarized in Table 1. Median AUCroc values 
achieved were 0.66 (D-PBTN), 0.65 (PBTN), 0.53 (DEPN) 
and 0.47 (BDAGL); Median AUCrr values were 0.45 (D- 
PBTN), 0.58 (PBTN), 0.18 (DEPN) and 0.28 (BDAGL). 
AUCroc results for D-PBTN, PBTN and DEPN are statis- 
tically significantly better than guessing (p < 0.0001). D- 
PBTN outperformed DEPN both with respect to AUCroc 
and AUCpR (Welch two sample t-test, both p < 0.0001). 
We observed that DEPN had particular difficulties when 
measured time points were far apart in time, with sev- 
eral unobserved intermediate steps between them. D- 
PBTN, which in contrast to DEPN were specifically 
designed for time course data, did not exhibit this prob- 
lem. Notably, the non time-series method PBTN showed 
only marginally inferior AUCroc than D-PBTN (median 
AUCroc = 0.66 D-PBTN vs. 0.65 PBTN, p = 0.051), 
but superior performance in terms of median AUCrr 
(0.45 D-PBTN vs. 0.58 PBTN). However, in contrast to 
D-PBTN, PBTN results are characterized by high variabil- 
ity (see Additional file 4: Figure S3, inter quartile range 
(IQR) AUC^^^^™ = 0.155, IQR AUC^™ = 0.431), 
and mean AUCrr values are very similar between the two 
approaches (0.457 D-PBTN vs. 0.46 PBTN). A statistical 
test comparing the AUCrr values from the 100 replicate 
runs of the two approaches shows no significant difference 
in performance measured by AUCrr (p = 0.9156, Welch 
two sample t-test). 

We next evaluated the effect of negative feedback loops 
on network inference, using data set SN2. This is a diffi- 
cult problem, as the feedback can only be inferred from 
the temporal evolution of the network. Parameters for 
Markov chains and prior were set as above, using 5 sub- 
populations with 3 chains each in sampling. This resulted 
in a running time of 16 hours (D-PBTN). Using the full 
data set, D-PBTN achieved a median AUCroc of 0.78 
and median AUCrr of 0.58; DEPNs achieved a median 
AUCroc of 0.60 and median AUCrr of 0.41. Median 
AUCroc performance values of PBTN and BDAGL were 
both 0.75, and were superior to DEPN, but inferior to D- 
PBTN. In line with results on data set SN3, we observed 
good AUCpR performance of the non time-series method 
PBTN (median AUCpr 0.75 PBTN vs. 0.58 D-PBTN), 



however, in contrast to D-PBTN, PBTN could not infer 
the feedback loop between nodes 7 and 4. 

Additional file 5: Figure S4A shows the distribution of 
edge weights obtained with D-PBTN, indicating that the 
posterior distribution (4) is unimodal. Using only obser- 
vations of proteins 6 and 7, unique networks can no longer 
be recovered (Additional file 5: Figure S4B). Probabili- 
ties for individual network topologies or even edges can 
be computed based on the sampled points, and can then 
be used e.g. to design experiments to resolve the true 
underlying network (Figure 2). 

Robustness with respect to model hyperparameters 

Since our model contains parameters to be set by the user, 
we next analyzed how robustly networks were inferred for 
varying model hyperparameters, using data set SNl. We 
varied y in equation (1) by up to ±50%. For changes of 
y by the maximum change of ±50% tested, the change 
observed in inference performance was a decrease of 
23.8% for AUCroc. and 26.5% for AUCrr (Additional 
file 6: Figure S5). Of note, even for a change by 50%, D- 
PBTN still performed significantly better than random 
guessing {AUCroc 0.643, AUCrr 0.593). We furthermore 
studied the influence of changing the prior hyperparam- 
eters q and s of the prior p(wij) (equation (6)), changing 
them simultaneously by up to ±20% (Additional file 7: 
Figure S6, panel A and B), or individually by up to 50% 
(supplementary Figure S6, panel C and D). This resulted 
in a maximum change of 24.7% in AUCroc and 22.3% 
in AUCpR for individual changes of q by 50%, decreas- 
ing AUCroc to 0.64 and AUCrr to 0.59, and changes by 
11.7% and 12.6% for AUCroc and AUCrr, respectively, 
for changes of s by 50%. We then tested the influence of 
changing the parameters n and p of the binomial prior on 
r by 50%, observing only a relatively minor impact of these 
parameters on performance (changes in AUCroc between 
0.797 and 0.856, changes in AUCrr between 0.719 and 
0.824). A more significant influence was observed for the 
parameters ro and 5o of the negative gamma prior on wq, 
which influences the baseline activity level of unregulated 
nodes, see Additional file 8: Figure S7. Overall, robust- 
ness analysis indicates that inference is reasonably stable 
for varying hyperparameters; still, some care should be 
exercised in choosing values. 

Inferring ERBB-mediated signal transduction 

To test D-PBTN on real data, we used publicly avail- 
able data from Frohlich et al [34], regarding 16 proteins 
involved in ERBB signaling. The data comprises 3 repli- 
cate measurements for 10 of the 16 proteins, measured 
after 16 different knock-downs using RNA interference, 
including 3 combinatorial knockdowns. Measurements 
were taken at two different time points, directly before 
stimulation with EGF, and after 12 hours of stimulation. 
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Figure 2 Inference of negative feedbacic cycles. The Figure shows reconstructed network topologies on simulated data SN2, using the full data 
set (panel A) and partial data with observations only for proteins 6 and 7 (panel B). In the latter case, no unique network can be identified anymore. 
Clustering of sampled w groups topologies according to structural similarity. Only one cluster arises for the complete data (panel A), whereas four 
major different clusters arise on partial data (panel B). Shown are median networks from the four clusters, together with cluster probabilities 
computed as the fraction of networks in the clusterto the total number of sampled networks. Edge thickness indicates strength of support for the 
edge within the cluster (thick solid: > 80% of samples within cluster, thin solid: > 60%, dashed: > 40% support), black lines indicate activations, red 
lines inhibitions. 



using reverse phase protein arrays. We discretized the 
normalized data using the midpoint of the negative and 
positive control medians as discretization threshold. Net- 
work inference was performed in 10 subpopulations with 
5 chains each, sampling over 10^ steps with a burn-in 
of 8,500 steps. Parameters q and s were set to 0.35 and 
14, respectively, corresponding to a fairly strict regular- 
ization. We set the stochasticity parameter y to y = 8, 
to obtain a slightly more deterministic behavior of the 
model than in case of the simulation study. Rate and 
shape parameters of the gamma prior p(wo^i) were set to 
ro = 10, 5o = 17, keeping unregulated nodes in the "off" 
state, and parameters of the binomial prior p(t) were set 
to w = 20 and p = 0.4, allowing for a relatively large 
number of model steps between the experimental mea- 
surements. Computation time took approximately 1590 
minutes, or roughly 26.5 hours. Clustering of the 10^ sam- 
pled weight vectors w shows only a single cluster with high 
probability, indicating a unimodal posterior. To obtain a 
single network for visualization purposes and further bio- 
logical discussion of inferred edges, we used a threshold 
8 = 0.65 X max|iv/,y | for network discretization, includ- 
ing only edges with sample median wq > 8 in the further 
evaluation - in words, edges were only included if the 
median of all points sampled for the corresponding edge 
was at least 65% of the maximum edge value sampled. 
This results in a network with comparable number of 
edges as the gold standard network shown in Additional 
file 9: Figure S8, constituting a tradeoff between false 
positive and false negative edges. We compared results 
obtained using D-PBTN with the non-time series ver- 
sion PBTN [35], as well as DEPN. Results for DEPN were 
taken from the original publication [34], the authors do 
not report AUC ROC and PR values for their method. 
Figure 3A shows the resulting network using D-PBTN, 



and Table 2 summarizes the results obtained using D- 
PBTN, PBTN and DEPN. Notably, without inclusion of 
any additional prior knowledge, both D-PBTN and DEPN 
show similar specificity and accuracy, but D-PBTN is both 
more sensitive and has higher precision than DEPN. The 
dynamic version D-PBTN outperform PBTN in all perfor- 
mance measures used, showing the additional value of the 
temporal information in the data when using this method. 

The inferred network (without using a network prior) 
is shown in Figure 3A, and displays significant cross-talk 
in ERBB-signaling. At the surface receptor level, D-PBTN 
predicts cross-talk between EGFR, ESR and IGFR, as 
recently confirmed experimentally [55]. We furthermore 
predict a direct interaction between AKT and MAPK, 
linking the corresponding pathways. In fact, involve- 
ment of MAPK signaling in AKT activation was recently 
reported [56]. At the transcription factor level, D-PBTN 
predicts an interaction between CDK4 and MYC, in line 
with observed expression changes of CDK4 and CDK6 
following MYC degradation [57]. 

Inference quality improves dramatically when prior 
knowledge is provided. We used the literature network 
presented in Frohlich et al, [34] as prior knowledge, with 
q = 2 and 5 = 1, and ijlq = +3 in case of an activation, —3 
in case of an inhibition, or 0 in case of no regulation in 
the prior network. We note that this prior network alone 
yields superior results than D-PBTN without any prior 
knowledge (compare Table 2), emphasising the impor- 
tance of using a network prior on this data set. Network 
inference was performed as above, resulting in the net- 
work depicted in Figure 3B. Obtained results are superior 
to both, using prior knowledge alone, as well as using 
only the experimental data (Table 2). To assess robustnes 
of these inference results with respect to model hyperpa- 
rameters, we furthermore performed a sensitivity analysis 
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Figure 3 Reconstructed ERBB-mediated signaling networlc.The plots show inferred network topologies for reverse phase protein array data 
acquired after stimulation of ERBB signaling and knockdowns of proteins involved in ERBB signaling. True positive edges, using high-confidence 
interactions in the STRING database as reference, are shown as solid black lines, newly predicted edges are shown as orange lines. True and false 
negatives are not depicted for the sake of readability of the plot. Edge thickness indicates strength of support for the edge (thick solid: > 80% of 
samples, thin solid: > 60%, dashed: > 40% support). A: Network inferred without using any prior knowledge, B: Network inferred using the literature 
interactions reported by [34] as prior knowledge for network inference. 



by modifying model hyperparameters individually by up 
to ±50%, and evaluated changes in resulting AUG ROC 
and AUG PR, compare Figure 4A and B. This analysis 
indicated that results are relatively stable, with the most 
influential parameter being the stochasticity parameter y. 
We evaluated predictions using IPA (Ingenuity), and 
could find additional evidence for most of the predicted 
edges: D-PBTN predicts regulations of IGFR, MAPK 
and CDK6 by ERBB3. Indeed, binding of ERBB3 and 
IGFR has been shown in lysate from SkBrS cells resistant 
to trastuzumab by immunoprecipitation [58]. Similarly, 
there is experimental evidence for an activation of MAPK 
by ERBB3, as we predicted [59,60]. The inferred acti- 
vation of GDK6 by ERBB3 to our knowledge has not 
been reported before, and may warrant further exper- 
iments to confirm or falsify this prediction. Some evi- 
dence also exists for the predicted strong activation of 
MYG by ESR. Both proteins have been shown by affinity 



chromatography to bind to one another [61], and blocking 
of ESRl was recently shown to decrease MYG expression 
[62]. Similarly, activation of AKT increases GGND and 
MYG expression [63,64], and a regulation of GGNE by 
MYG [65,66] and the activation of GGND by MAPK [67] 
have been demonstrated experimentally. Interestingly, D- 
PBTN predicts an activation of MYG and GGNE by IGFR. 
Both interactions to our knowledge have not been pre- 
viously reported, and may be interesting candidates for 
experimental validation. 

Discussion 

In this manuscript, we present D-PBTN, a novel method 
to infer signal-transduction networks from time course 
perturbation data. We evaluate D-PBTN in an exten- 
sive simulation study, and demonstrate its application on 
experimental data regarding ERBB signaling. The simula- 
tion study shows stable inference results in light of missing 



Table 2 Performance comparison ERBB signaling network 




Sensitivity 


Specificity 


Accuracy 


Precision 


AUCroc 


AUCpR 


D-PBTN 


0.44 


0.83 


0.75 


0.41 


0.72 


0.64 


PBTN 


0.20 


0.77 


0.66 


0.20 


0.52 


0.31 


DEPN 


0.26 


0.86 


0.73 


0.33 


not avi. 


not avi. 


D-PBTN (+Prior) 


0.59 


0.90 


0.84 


0.62 


0.78 


0.70 


DEPN (+Prior) 


0.59 


0.87 


0.81 


0.55 


not avi. 


not avi. 


Prior network 


0.48 


0.87 


0.79 


0.5 







Achieved sensitivity, specificity, accuracy and precision for the inference of ERBB signaling [34]. Shown are results of D-PBTN, PBTN and DEPN, with and without prior 
information (+Prior). The last row shows results of the prior network alone. Network inference was assessed using STRING as gold standard. Results for DEPN were 
taken from Froehlich et al. [34], authors do not report /A L/Croc or AUCpr values. 
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Figure 4 Hyperparameter sensitivity analysis ERBB network. The Figure shows sensitivity of the AUCroc (panel A) and AUCpr (panel B) of the 
ERBB network (with literature prior) with respect to changes in model hyperparameters. Model parameters /x, s, q, y, tq and sq where changed by up 
to ±50%, and inference was repeated for each parameter choice. Shown are resulting values for the area under the receiver operator characteristic 
{AUCroc) and the area under the precision-recall {AUCpr) curves. The analysis indicates that inference results are reasonably robust to changes in 
model hyperparameters, with the stochasticity parameter y being the most critical parameter for inference performance. 



data and noise, and reasonable robustness with respect 
to model hyperparameters. On the ERBB application, we 
have run the inference with and without a network prior, 
and provide both results for comparison in Figure 3. While 
the overall topology is similar whether or not a prior net- 
work is used, there are significant differences in individual 
edges. The quality of network inference with a network 
prior clearly depends on the quality of the prior used as 
well as on the amount and quality of experimental data 
available, we therefore recommend to carefully evaluate 
the effect a given network prior has. In our example, there 
is experimental evidence for many of the predicted inter- 
actions, both from the run with and without network 
prior, directly confirming the quality of the inference. 
Furthermore, in addition to confirming known interac- 
tions, D-PBTN predicts several novel edges, which may be 
interesting candidates for experimental validation. 

Running time remains a major concern with D-PBTN, 
mainly due to the time required for sampling from the 
posterior distribution. Each single sampling step requires 
running time 0(KTN^), where K is the number of per- 
turbations, T is the number of (equidistant) time steps, 
and N is the number of nodes in the network. This can be 
easily seen from the likelihood function (2), which essen- 
tially iterates over all time points, perturbations and nodes 
in the network, and computes equation (1) in each step, 
which again iterates over all nodes. Running time for the 
likelihood dominates the time required for the prior dis- 
tribution, hence the total running time for the posterior 
is 0{KTN^), Unfortunately, the shape and complexity of 
the likelihood function, and thus the number of sampling 



steps required to sample from the posterior p{D\Q) (both 
to reach the stationary distribution of the chain, and num- 
ber of steps required to sufficiently traverse the support 
of the posterior to get a good representation of the dis- 
tribution in the sample), also depends on /C, T and N 
in a nontrivial way, and the choice of required sampling 
steps (with associated impact on total running time of 
the algorithm) is a nontrivial problem. Clearly running 
time increases linearly with the number of sampling steps, 
but how exactly the minimum number of sampling steps 
required depends on /C, T and N is unclear. 

While the present implementation of D-PBTN is a 
non-parallel implementation, evolutionary MCMC is 
straightforward to parallelize, allowing for a very efficient 
parallel implementation of the sampler. In practical terms, 
running time on the SNl example (7 nodes) was approx- 
imately 7.4 hours with the present, non-parallelized ver- 
sion of the algorithm, and computations much beyond the 
size of the ERBB network with its 16 nodes appear com- 
putationally prohibitive without further parallelization. 
An efficient parallel implementation on a larger compute 
cluster in principle should make D-PBTN suitable also for 
"larger" network inference problems with several dozen, 
possibly up to a few hundred nodes, but this is clearly not 
an approach that is suitable for networks with thousands 
of proteins, let alone proteome-wide models. 

As with all Bayesian approaches, prior distributions 
and prior hyperparameters impact results, and need to 
be carefully chosen. Reasonable values for some of the 
parameters can be estimated from biological expectations, 
for example the expected level of "connectedness" in a 
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network can be used to estimate parameters q and s of the 
prior on w, or information about the speed of signal trans- 
duction in a network could be used to estimate hyperpa- 
rameters for the prior on r. However, still a significant 
amount of "gut feeling" and experience are required to 
get good estimates. While our simulation study indicates 
reasonable stability with respect to parameter changes, 
a possible extension of our work is to employ empirical 
Bayes' approaches to set model hyperparameters [17,18], 
or, granted enough experimental data and compute time 
are available, crossvalidation-schemes could be used for 
hyperparameter estimation. 

The comparison between D-PBTN and the non time- 
series version PBTN on the different simulated data sets 
gave some very interesting results. While D-PBTN out- 
performed PBTN on all data sets when AUCroc was used 
as performance measure, PBTN yielded higher median 
AUCpR than D-PBTN on datasets with high levels of 
noise, or data simulated using a different underlying math- 
ematical model. AUCpR in contrast to AUCrqc is less 
influenced by an unequal distribution of class labels (as 
occurs in sparse networks), the superior results of PBTN 
over D-PBTN under certain conditions are therefore of 
relevance for the choice of inference method. It can be 
shown that if a ROC curve for method A dominates a 
ROC curve for method B, then also the corresponding PR 
curve for method A dominates the PR curve for method 
B, and vice versa [68]. The inverted ranking of D-PBTN 
and PBTN in AUCrqc and AUCpr implies that the ROC 
curves (and by the same argument the PR curves) of D- 
PBTN and PBTN do not dominate one another. Rather, 
there are certain domains in the ROC and PR plots where 
one or the other method becomes better, and correspond- 
ingly the curves for D-PBTN and PBTN intersect. This 
in particular occured in our simulation study for the data 
with high levels of noise. It is tempting to speculate that 
under conditions with many "wrong" data points, focus- 
ing on a stable steady state (as done with PBTN) may 
have advantages over trying to capture the full dynamics 
of the system, as D-PBTN does. This point will require 
further research in the future, to determine what exactly 
determines which of the two approaches is superior under 
given conditions, and to provide guidelines for method 
selection. On the other hand, if the interest is in feed- 
back loops and cycles in the networks, PBTN should not 
be used, as this approach cannot infer cyclic networks. 
Interestingly, on the ERBB application example, D-PBTN 
clearly outperformed PBTN both with respect to AUCrqc 
and AZiCpT?. 

Conclusion 

In this manuscript, we have developed a new method to 
infer signal transduction networks from time course per- 
turbation data. Based on relatively few time points after 



a large number of different perturbations, our method is 
able to reconstruct the underlying signaling network with 
high accuracy. The mathematical approach we employ is 
based on dynamical Bayesian networks, and assumes dis- 
crete states and time steps, at which signaling molecules 
are either "on" or "off". The approach is therefore suit- 
able to experiments where protein activation or protein 
expression is completely or almost completely switched 
off, such as with RNAi knockdowns. 

Several related approaches exist. Our work extends 
PBTN by explicitly taking time into account. In con- 
trast to PBTN, the method can thereby infer feedback 
and feedforward loops. Provided sufficient time-resolved 
measurements after (possibly combinatorial) knockdowns 
are available, a unique network can be inferred using D- 
PBTN. This is in contrast to methods such as PBTN, 
DEPN and Nested Effects Models (NEMs), which only 
return equivalence classes or single networks out of an 
equivalence class. In this regard, our method is com- 
parable to dynamic DEPNs and dynamic NEMs, which 
also explicitly take time course data into account. How- 
ever, D-NEMs have been developed for "effect" observa- 
tions after knockdowns, which only provide very indirect 
information about the underlying network. D-DEPNs, 
on the other hand, require long time series and work 
most efficiently if a relatively small number of perturba- 
tions are carried out. The typical experimental scenario 
encountered in practice is different, comprising short time 
courses after a large number of perturbations. D-PBTN 
have explicitly been developed for this situation. Due to 
the Bayesian approach pursued, prior biological knowl- 
edge can easily be integrated into D-PBTN inference, and 
marginalization over unobserved proteins or time points 
is straightforward and very efficient due to the likelihood 
simulation. 

We employ an MCMC approach to sample from the 
posterior distribution of models given the experimental 
data, which allows it to compute probability distributions 
over alternative network topologies. This is particularly 
useful if multiple models explain the data equally well, and 
can be used to design additional experiments to then dis- 
tinguish further between high-scoring topologies. Using 
halite clustering, similar models can be grouped together 
and probabilities estimated from the MCMC samples. 

Overall, D-PBTN is a powerful approach for net- 
work inference when time-resolved measurements of the 
response of a dynamical system after perturbation are 
available. Under these conditions, the approach outper- 
forms other state-of-the-art methods. Most approaches 
available in the literature consider either changes in steady 
state levels after perturbation, or the temporal dynamics 
of an unperturbed system for network inference. D-PBTN 
makes use of both the dynamical aspects and the per- 
turbation effects in network reconstruction. We present 
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an extensive simulation study to evaluate the approach, 
showing stable performance under varying conditions of 
noise and missing values in the data, and we demonstrate 
the application of D-PBTN to infer signaling networks 
in the ERBB system. Here, our approach both recon- 
firms known interactions, as well as suggesting some novel 
edges as candidates for further experimental study. 

Availability and requirements 

Project name: Dynamic Probabilistic Threshold 
Networks 

Project home page: http://www.kaderali.org/software/ 
dpbtn.html. 

Operating system(s): Tested on Microsoft Windows 
Programming language: C# 
Other requirements: None 
License: GNU GPL 

Any restrictions to use by non-academics: None 
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Additional file 2: Supplementary Figure SI. Effect of the amount of 
unobserved proteins on inference performance (SNl reconstruction). 
Shown is the distribution of the area under the precision recall curve 
[AUCpRj for 100 replicates of simulated data sets, over the fraction of 
unobserved proteins, for D-PBTN, DEPN, BDAGL and PBTN on dataset SN 1 . 
The dashed line shows expected AUCpr results for random guessing. 

Additional file 3: Supplementary Figure S2. Effect of noise on inference 
performance (SNl reconstruction). The Figure shows the distribution of 
AUCpR values obtained for different levels of noise on SNl, for D-PBTN, 
DEPN, BDAGL and PBTN. The dashed line shows expected AUCpr results for 
random guessing. 

Additional file 4: Supplementary Figure S3. Inference performance on 
the reconstruction the KEGG networks. This Figure shows performance of 
the D-PBTN, DEPN BDAGL and PBTNapproaches on subnetworks taken 
from the KEGG database, with data simulated as described in methods. 
Shown 3^eAUCpp values of 100 simulated data sets, generated from ten 
different KEGG subnetworks. The dashed line shows expected AUCpp 
results for random guessing. 

Additional file 5: Supplementary Figure S4. Inference of negative 
feedback cycles: Bean-plot showing the distributions of inferred edge 
weights. A unimodal distribution is observed with the full data (panel A), 
whereas no unique network can be identified when only observations of 
two downstream proteins are given. 

Additional file 6: Supplementary Figure S5. Effect of model parameters 
on performance: The Figure shows the effect of varying the stochasticity 
parameter on the AUG values {AUCpoc and AUCpp). Network inference has 
been done using data simulated from Simulated Network I.AUCroc values 
are shown as red squares, AUCpp values are shown as blue circles. 

Additional file 7: Supplementary Figure S6. Effect of model parameters 
on performance: The Figure shows the effect of varying the rate and shape 
parameter q and s of the prior p{w) on the AUG values. Network inference 
has been done using data simulated from Simulated Network 1. Panels A 
and B: Effect of simultaneous changes of and 5 by up to 20% on AUCpoc 
and AUCpR. Panels B and C: Effect of changing parameters q and s 
individually by up to 50%. 

Additional file 8: Supplementary Figure S7. Effect of model parameters 
on performance: The Figure shows the effect of varying the rate and shape 



parameters tq (Panel A) and Sq (Panel B) of the prior on the bias weight wq 
by up to 50%. Network inference has been done using data simulated from 
Simulated Network I.AUCroc values are shown as red squares, AUCrr values 
are shown as blue circles. 

Additional file 9: Supplementary Figure S8. Experimental reference 
network extracted from STRING for ERBB network inference. 
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